Contextuality from missing and versioned data

Author

  • Jason Morton
Abstract

Traditionally, categorical data analysis (e.g., generalized linear models) works with simple, flat datasets akin to a single table in a database, with no notion of missing data or conflicting versions. In contrast, modern data analysis must deal with distributed databases with many partial local tables that need not always agree. The computational agents tabulating these tables are spatially separated, with binding speed-of-light constraints and data arriving too rapidly for these distributed views ever to be fully informed and globally consistent. Contextuality is a mathematical property that describes a kind of inconsistency arising in quantum mechanics (e.g., in Bell’s theorem). In this paper we show how contextuality can arise in common data collection scenarios, including missing data and versioning (as in low-latency distributed databases employing snapshot isolation). In the companion paper, we develop statistical models adapted to this regime.
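
As a toy illustration of the kind of inconsistency described above (a sketch that is not taken from the paper), consider three binary variables A, B, C that are only ever tabulated in pairs. The hypothetical tables below say each pair always disagrees. Every single-variable marginal matches across tables, so the tables are locally consistent, yet no global assignment of A, B, C is compatible with all three, so no single joint table can have them as its marginals. The names `local_tables`, `locally_consistent`, and `compatible_global_assignments` are assumptions made for this example.

```python
from itertools import product

# Hypothetical local tables: each maps a pair of variables to a
# probability table over the joint outcomes observed for that pair.
local_tables = {
    ("A", "B"): {(0, 1): 0.5, (1, 0): 0.5},
    ("B", "C"): {(0, 1): 0.5, (1, 0): 0.5},
    ("A", "C"): {(0, 1): 0.5, (1, 0): 0.5},
}


def locally_consistent(tables):
    """True if every variable has the same one-variable marginal in each table."""
    seen = {}
    for pair, table in tables.items():
        for pos, var in enumerate(pair):
            marginal = {0: 0.0, 1: 0.0}
            for outcome, p in table.items():
                marginal[outcome[pos]] += p
            seen.setdefault(var, []).append(marginal)
    return all(all(m == ms[0] for m in ms) for ms in seen.values())


def compatible_global_assignments(tables):
    """List global 0/1 assignments whose projection onto every pair
    has positive probability in that pair's local table."""
    variables = sorted({v for pair in tables for v in pair})
    compatible = []
    for values in product((0, 1), repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if all(
            table.get((assignment[x], assignment[y]), 0.0) > 0.0
            for (x, y), table in tables.items()
        ):
            compatible.append(assignment)
    return compatible


if __name__ == "__main__":
    print("locally consistent:", locally_consistent(local_tables))
    print("compatible global assignments:", compatible_global_assignments(local_tables))
```

An empty list here is sufficient to rule out a global joint distribution, because any such distribution would have to put positive mass on at least one assignment. The converse does not hold: a non-empty list does not by itself show that a joint distribution with these marginals exists.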

Similar resources

Testing Contextuality in Cyclic Psychophysical Systems of High Ranks

Contextuality-by-Default (CbD) is a mathematical framework for understanding the role of context in systems with deterministic inputs and random outputs. A necessary and sufficient condition for contextuality was derived for cyclic systems with binary outcomes. In quantum physics, the cyclic systems of ranks n = 5, 4, and 3 are known as systems of Klyachko-type, EPR-Bell-type, and Leggett-Garg-...

Influence of Pattern of Missing Data on Performance of Imputation Methods: An Example from National Data on Drug Injection in Prisons

Background Policy makers need models to be able to detect groups at high risk of HIV infection. Incomplete records and dirty data are frequently seen in national data sets. Presence of missing data challenges the practice of model development. Several studies suggested that performance of imputation methods is acceptable when missing rate is moderate. One of the issues which was of less concern...

Versioning of Granulated Data in Hierarchically Composed Workspaces

For the last 30 years there has been a lot of research of versioned software products, but challenges remain nevertheless. This article focuses on a model of versioned objects and hierarchically composed workspaces. The presented model of versioned object aims to solve the issue of granulation of versioned data. The model of hierarchically composed workspaces provides methods and rules for vers...

Optimal query/update tradeoffs in versioned dictionaries

External-memory dictionaries are a fundamental data structure in file systems and databases. Versioned (or fully persistent) dictionaries have an associated version tree where queries can be performed at any version, updates can be performed on leaf versions, and any version can be ‘cloned’ by adding a child. Various query/update tradeoffs are known for unversioned dictionaries, many of them wit...

Adapting Scott and Bruce's General Decision-Making Style Inventory to Patient Decision Making in Provider Choice.

OBJECTIVE Research testing the concept of decision-making styles in specific contexts such as health care-related choices is missing. Therefore, we examine the contextuality of Scott and Bruce's (1995) General Decision-Making Style Inventory with respect to patient choice situations. METHODS Scott and Bruce's scale was adapted for use as a patient decision-making style inventory. In total, 38...

Journal title:
  • CoRR

Volume abs/1708.03264

Publication date 2017